Multilabel Text Classification with Label-Dependent Representation

Abstract

Assigning predefined classes to natural language texts, based on their content, is a necessary component in many tasks in organizations. This task is carried out by classifying documents within a set of predefined categories using models and computational methods. Text representation for classification purposes has traditionally been performed using a vector space model due to its good performance and simplicity. Moreover, the classification of texts via multilabeling has typically been approached by using simple label classification methods, which require the transformation of the problem studied to apply binary techniques, or by adapting binary algorithms. Over the previous decade, text classification has been extended using deep learning models. Compared to traditional machine learning methods, deep learning avoids rule design and feature selection by humans, and automatically provides semantically meaningful representations for text analysis. However, deep learning-based text classification is data-intensive and computationally complex. Interest in deep learning models does not rule out techniques based on shallow learning. This is true when the set of training cases is smaller and when the set of features is small. White box approaches have advantages over black box approaches, among which the feasibility of working with relatively small sets of data and the interpretability of the results stand out. This research evaluates a weighting function of the words in texts to modify the representation of the texts during multilabel classification, using a combination of two approaches: problem transformation and model adaptation. This weighting function was tested on 10 referential textual data sets and compared with alternative techniques based on three performance measures: Hamming Loss, Accuracy, and macro-F1. The best improvement occurs in macro-F1 when the data sets have fewer labels, fewer documents, and smaller vocabulary sizes. In addition, the performance improves in data sets with higher cardinality, density, and diversity of labels. This proves the usefulness of the function on smaller data sets. The results show improvements of more than 10% in terms of macro-F1 in classifiers based on our method in almost all of the cases analyzed.
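The evaluation measures and data set statistics named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (Hamming Loss as the fraction of mismatched label slots, macro-F1 as the unweighted mean of per-label F1, cardinality as the mean number of labels per document, density as cardinality divided by the number of labels, diversity as the count of distinct label sets); the paper may define them slightly differently. All function names and the toy data are hypothetical, and the sketch does not reproduce the authors' weighting function.

```python
from typing import List, Set


def hamming_loss(y_true: List[Set[int]], y_pred: List[Set[int]], n_labels: int) -> float:
    """Fraction of (document, label) slots where truth and prediction disagree."""
    mismatches = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return mismatches / (len(y_true) * n_labels)


def macro_f1(y_true: List[Set[int]], y_pred: List[Set[int]], n_labels: int) -> float:
    """Unweighted mean of per-label F1 scores.

    Assumption: a label with no true and no predicted instances scores 0.0.
    """
    scores = []
    for label in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


def label_cardinality(y: List[Set[int]]) -> float:
    """Mean number of labels per document."""
    return sum(len(labels) for labels in y) / len(y)


def label_density(y: List[Set[int]], n_labels: int) -> float:
    """Cardinality normalised by the total number of labels."""
    return label_cardinality(y) / n_labels


def label_diversity(y: List[Set[int]]) -> int:
    """Number of distinct label sets that occur in the data."""
    return len({frozenset(labels) for labels in y})


# Toy data (hypothetical): four documents, three labels indexed 0..2.
y_true = [{0, 1}, {1}, {0, 2}, {2}]
y_pred = [{0}, {1}, {0, 1, 2}, {2}]

print("Hamming Loss:", hamming_loss(y_true, y_pred, 3))
print("macro-F1:", macro_f1(y_true, y_pred, 3))
print("cardinality:", label_cardinality(y_true))
print("density:", label_density(y_true, 3))
print("diversity:", label_diversity(y_true))
```

Because macro-F1 averages over labels rather than over documents, a rarely used label counts as much as a frequent one, which is one reason the measure is sensitive to the number of labels in a data set.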

Similar resources

Multilabel Classification with Principal Label Space Transformation

We consider a hypercube view to perceive the label space of multilabel classification problems geometrically. The view allows us not only to unify many existing multilabel classification approaches but also design a novel algorithm, principal label space transformation (PLST), that captures key correlations between labels before learning. The simple and efficient PLST relies on only singular va...

Flexible Text Segmentation with Structured Multilabel Classification

Many language processing tasks can be reduced to breaking the text into segments with prescribed properties. Such tasks include sentence splitting, tokenization, named-entity extraction, and chunking. We present a new model of text segmentation based on ideas from multilabel classification. Using this model, we can naturally represent segmentation problems involving overlapping and non-contiguo...

Label Filters for Large Scale Multilabel Classification

When assigning labels to a test instance, most multilabel and multiclass classifiers systematically evaluate every single label to decide whether it is relevant or not. This linear scan over labels becomes prohibitive when the number of labels is very large. To alleviate this problem we propose a two step approach where computationally efficient label filters pre-select a small set of candidate...

Multilabel Classification with Label Correlations and Missing Labels

Many real-world applications involve multilabel classification, in which the labels can have strong interdependencies and some of them may even be missing. Existing multilabel algorithms are unable to handle both issues simultaneously. In this paper, we propose a probabilistic model that can automatically learn and exploit multilabel correlations. By integrating out the missing information, it ...

Multilabel Classification Exploiting Coupled Label Similarity with Feature Selection

In multilabel classification, each example is represented with features and associated with multiple labels. Multilabel classification aims to predict the set of labels for unseen instances. Researchers have developed multilabel classification using both the problem transformation approach and the algorithm adaptation approach. An algorithm called MLkNN that follows the algorithm adaptation approach has bee...

Journal

Journal title: Applied Sciences

Year: 2023

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app13063594